OLLIE: On-Line Learning For Information Extraction

Authors

  • Valentin Tablan
  • Kalina Bontcheva
  • Diana Maynard
  • Hamish Cunningham
Abstract

This paper reports work aimed at developing an open, distributed learning environment, OLLIE, where researchers can experiment with different Machine Learning (ML) methods for Information Extraction. Once the required level of performance is reached, the ML algorithms can be used to speed up the manual annotation process. OLLIE uses a browser client, while data storage and ML training are performed on servers. The different ML algorithms share a unified programming interface, so integrating new ones is straightforward.
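
The idea of a unified programming interface for interchangeable ML algorithms can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names `Learner`, `MajorityLearner`, `TokenLookupLearner`, `REGISTRY`, and `suggest_annotations` are hypothetical and are not taken from OLLIE, and the two toy learners stand in for whatever real algorithms a server might host.

```python
from abc import ABC, abstractmethod
from collections import Counter, defaultdict


class Learner(ABC):
    """Hypothetical common interface that every ML backend implements."""

    @abstractmethod
    def train(self, examples):
        """Fit on a list of (features: dict, label: str) pairs."""

    @abstractmethod
    def predict(self, features):
        """Return a predicted label for one feature dict."""


class MajorityLearner(Learner):
    """Toy baseline: always predicts the most frequent training label."""

    def __init__(self):
        self.label = None

    def train(self, examples):
        counts = Counter(label for _, label in examples)
        self.label = counts.most_common(1)[0][0]

    def predict(self, features):
        return self.label


class TokenLookupLearner(Learner):
    """Toy learner: memorises the most frequent label seen per token."""

    def __init__(self, default="O"):
        self.default = default
        self.table = {}

    def train(self, examples):
        counts = defaultdict(Counter)
        for features, label in examples:
            counts[features["token"]][label] += 1
        self.table = {tok: c.most_common(1)[0][0] for tok, c in counts.items()}

    def predict(self, features):
        return self.table.get(features["token"], self.default)


# Adding a new algorithm means writing one class and registering it here.
REGISTRY = {"majority": MajorityLearner, "lookup": TokenLookupLearner}


def suggest_annotations(learner_name, training, tokens):
    """Train the chosen learner, then propose a label for each token."""
    learner = REGISTRY[learner_name]()
    learner.train(training)
    return [(tok, learner.predict({"token": tok})) for tok in tokens]
```

Because the calling code only touches `train` and `predict`, swapping `"lookup"` for `"majority"` changes the algorithm without changing the caller, which is the property the abstract attributes to a shared interface.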


Similar resources

Open Language Learning for Information Extraction

Open Information Extraction (IE) systems extract relational tuples from text, without requiring a pre-specified vocabulary, by identifying relation phrases and associated arguments in arbitrary sentences. However, state-of-the-art Open IE systems such as REVERB and WOE share two important weaknesses – (1) they extract only relations that are mediated by verbs, and (2) they ignore context, thus e...


Open Information Extraction with Tree Kernels

Traditional relation extraction seeks to identify pre-specified semantic relations within natural language text, while open Information Extraction (Open IE) takes a more general approach, and looks for a variety of relations without restriction to a fixed relation set. With this generalization comes the question, what is a relation? For example, should the more general task be restricted to rel...


Open Information Extraction via Contextual Sentence Decomposition

We show how contextual sentence decomposition (CSD), a technique originally developed for high-precision semantic search, can be used for open information extraction (OIE). Intuitively, CSD decomposes a sentence into the parts that semantically “belong together”. By identifying the (implicit or explicit) verb in each such part, we obtain facts like in OIE. We compare our system, called CSD-IE, ...


ON-LINE SOLID-PHASE EXTRACTION AND LIQUID CHROMATOGRAPHY/PARTICLE BEAM-MASS SPECTROMETRY FOR DEGRADATION STUDIES OF SOME POLAR PESTICIDES IN WATER

An on-line automated method for photodegradation studies of isoproturon, diuron, atrazine, fenitrothion, and metoxuron by means of liquid chromatography/mass spectrometry (LC/MS) with particle beam (PB) interface is described. Surface water samples were first spiked with 50 µg/l of each pesticide and then exposed to the radiation of the medium-pressure mercury lamp. Next, in regular intervals o...


Phishing website detection using weighted feature line embedding

The aim of phishing is to obtain users' private information without their permission by designing a new website that mimics a trusted website. Information technology specialists do not agree on a unique definition of the discriminative features that characterize phishing websites. Therefore, the number of reliable training samples in phishing detection problems is limited. M...




Publication date: 2003